{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# <center/>MindInsight的溯源分析和对比分析体验"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 概述\n",
    "\n",
    "在模型调参的场景下，需要多次调整模型超参并进行多次训练，这个过程，往往需要手动记录每次训练使用参数以及训练结果。为此，MindSpore提供了自动记录模型参数、训练信息及训练结果评估指标的功能，并通过MindInsight进行可视化展示。本次体验会从MindInsight的数据记录、可视化效果、如何方便用户在模型调优和数据调优上做一次整体流程的体验。\n",
    "\n",
    "下面按照MindSpore的训练数据模型的正常步骤进行，使用`SummaryCollector`进行数据保存操作，本次体验的整体流程如下：\n",
    "\n",
    "1. 数据集的准备，这里使用的是MNIST数据集。\n",
    "\n",
    "2. 构建一个网络，这里使用LeNet网络。\n",
    "\n",
    "3. 记录数据及启动训练。\n",
    "\n",
    "4. 启动MindInsight服务。\n",
    "\n",
    "5. 溯源分析的使用。\n",
    "\n",
    "6. 对比分析的使用。\n",
    "\n",
    "\n",
    "本次体验将使用快速入门案例作为基础用例，将MindInsight的溯源分析和对比分析的数据记录功能加入到案例中。\n",
    "\n",
    "> 快速入门案例的源码请参考：<https://gitee.com/mindspore/docs/blob/master/tutorials/tutorial_code/lenet/lenet.py>。\n",
    "\n",
    "> 本文档适用于GPU环境。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 数据集准备"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 数据集下载"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "本例使用数据集为`MNIST_Data`，数据集下载地址：<br/>训练数据集：{\"<http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz>\", \"<http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz>\"}\n",
    "<br/>测试数据集：{\"<http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz>\", \"<http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz>\"}<br/>\n",
    "下载后，将数据集解压出来，按照如下目录结构放置：\n",
    " ```\n",
    "├── lineage_and_scalars_comparision.ipynb\n",
    "└── datasets\n",
    "    └── MNIST_Data\n",
    "        ├── test\n",
    "        │   ├── t10k-images-idx3-ubyte\n",
    "        │   └── t10k-labels-idx1-ubyte\n",
    "        └── train\n",
    "            ├── train-images-idx3-ubyte\n",
    "            └── train-labels-idx1-ubyte\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "其中`lineage_and_scalars_comparision.ipynb`为本文文档。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 数据集处理"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "数据集处理对于训练非常重要，好的数据集可以有效提高训练精度和效率。在加载数据集前，我们通常会对数据集进行一些处理。\n",
    "<br/>我们定义一个函数`create_dataset`来创建数据集。在这个函数中，我们定义好需要进行的数据增强和处理操作：\n",
    "\n",
    "1. 定义数据集。\n",
    "2. 定义进行数据增强和处理所需要的一些参数。\n",
    "3. 根据参数，生成对应的数据增强操作。\n",
    "4. 使用`map`映射函数，将数据操作应用到数据集。\n",
    "5. 对生成的数据集进行处理。\n",
    "\n",
    "具体的数据集操作可以在MindInsight的数据溯源中进行可视化分析。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import mindspore.dataset.vision.c_transforms as CV\n",
    "import mindspore.dataset.transforms.c_transforms as C\n",
    "from mindspore.dataset.vision import Inter\n",
    "from mindspore.common import dtype as mstype\n",
    "import mindspore.dataset as ds\n",
    "\n",
    "def create_dataset(data_path, batch_size=16, repeat_size=1,\n",
    "                   num_parallel_workers=1):\n",
    "    \"\"\" create dataset for train or test\n",
    "    Args:\n",
    "        data_path (str): Data path\n",
    "        batch_size (int): The number of data records in each group\n",
    "        repeat_size (int): The number of replicated data records\n",
    "        num_parallel_workers (int): The number of parallel workers\n",
    "    \"\"\"\n",
    "    # define dataset\n",
    "    mnist_ds = ds.MnistDataset(data_path)\n",
    "\n",
    "    # define some parameters needed for data enhancement and rough justification\n",
    "    resize_height, resize_width = 32, 32\n",
    "    rescale = 1.0 / 255.0\n",
    "    shift = 0.0\n",
    "    rescale_nml = 1 / 0.3081\n",
    "    shift_nml = -1 * 0.1307 / 0.3081\n",
    "\n",
    "    # according to the parameters, generate the corresponding data enhancement method\n",
    "    resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)\n",
    "    # if you need to use SummaryCollector to extract image data, do not use the following normalize operator operation\n",
    "    rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)\n",
    "    rescale_op = CV.Rescale(rescale, shift)\n",
    "    hwc2chw_op = CV.HWC2CHW()\n",
    "    type_cast_op = C.TypeCast(mstype.int32)\n",
    "\n",
    "    # using map method to apply operations to a dataset\n",
    "    mnist_ds = mnist_ds.map(operations=type_cast_op, input_columns=\"label\", num_parallel_workers=num_parallel_workers)\n",
    "    mnist_ds = mnist_ds.map(operations=resize_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n",
    "    mnist_ds = mnist_ds.map(operations=rescale_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n",
    "    mnist_ds = mnist_ds.map(operations=rescale_nml_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n",
    "    mnist_ds = mnist_ds.map(operations=hwc2chw_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n",
    "    \n",
    "    # process the generated dataset\n",
    "    buffer_size = 10000\n",
    "    mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size)  # 10000 as in LeNet train script\n",
    "    mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)\n",
    "    mnist_ds = mnist_ds.repeat(repeat_size)\n",
    "\n",
    "    return mnist_ds"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 定义LeNet5网络"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "本例采用的网络模型为LeNet5网络，对于手写数字分类表现得非常出色，网络模型定义如下。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mindspore.ops as ops\n",
    "import mindspore.nn as nn\n",
    "from mindspore.common.initializer import Normal\n",
    "\n",
    "class LeNet5(nn.Cell):\n",
    "    \"\"\"Lenet network structure.\"\"\"\n",
    "    def __init__(self):\n",
    "        super(LeNet5, self).__init__()\n",
    "        self.batch_size = 32 \n",
    "        self.conv1 = nn.Conv2d(1, 6, 5, pad_mode=\"valid\")\n",
    "        self.conv2 = nn.Conv2d(6, 16, 5, pad_mode=\"valid\")\n",
    "        self.fc1 = nn.Dense(16 * 5 * 5, 120, Normal(0.02), Normal(0.02))\n",
    "        self.fc2 = nn.Dense(120, 84, Normal(0.02), Normal(0.02))\n",
    "        self.fc3 = nn.Dense(84, 10)\n",
    "        self.relu = nn.ReLU()\n",
    "        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)\n",
    "        self.flatten = nn.Flatten()\n",
    "        self.image_summary = ops.ImageSummary()\n",
    "        self.tensor_summary = ops.TensorSummary()\n",
    "\n",
    "    def construct(self, x):\n",
    "        self.image_summary(\"image\", x)\n",
    "        self.tensor_summary(\"tensor\", x)\n",
    "        x = self.max_pool2d(self.relu(self.conv1(x)))\n",
    "        x = self.max_pool2d(self.relu(self.conv2(x)))\n",
    "        x = self.flatten(x)\n",
    "        x = self.relu(self.fc1(x))\n",
    "        x = self.relu(self.fc2(x))\n",
    "        x = self.fc3(x)\n",
    "        return x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 记录数据及启动训练"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "MindSpore 提供 `SummaryCollector` 接口来记录训练过程中的信息。\n",
    "\n",
    "为了更好的体验溯源分析和对比分析的效果，这里将调整学习率（`learning_rate`）、迭代次数（`epoch_size`）、batch数量（`batch_size`）来多次训练模型，并使用`SummaryCollector`保存对应的数据。\n",
    "\n",
    "`learning_rate`取值分别为0.01和0.05。\n",
    "\n",
    "`epoch_size`取值分别为2和5。\n",
    "\n",
    "`batch_size`取值分别为16和32。\n",
    "\n",
    "每次调整一个参数进行训练，总共分2x2x2=8组参数。\n",
    "\n",
    "`SummaryCollector`的更多用法，请参考[API文档](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.train.html#mindspore.train.callback.SummaryCollector)。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "================= The Situation 1 =================\n",
      "== learning_rate:0.01, epoch_size:2, batch_size:16 ==\n",
      "================ Starting Training ================\n",
      "epoch: 1 step: 125, loss is 2.306003\n",
      "epoch: 1 step: 250, loss is 2.27158\n",
      "epoch: 1 step: 375, loss is 2.2932913\n",
      "... ...\n",
      "epoch: 2 step: 3500, loss is 0.012429783\n",
      "epoch: 2 step: 3625, loss is 0.03963431\n",
      "epoch: 2 step: 3750, loss is 0.0032815132\n",
      "================ Starting Testing ================\n",
      "============ Accuracy:{'Accuracy': 0.9768} ============\n",
      "\n",
      "\n",
      "================= The Situation 2 =================\n",
      "== learning_rate:0.01, epoch_size:2, batch_size:32 ==\n",
      "================ Starting Training ================\n",
      "epoch: 1 step: 125, loss is 2.2971187\n",
      "epoch: 1 step: 250, loss is 2.298027\n",
      "epoch: 1 step: 375, loss is 2.2967234\n",
      "... ...\n",
      "epoch: 2 step: 1625, loss is 0.075038366\n",
      "epoch: 2 step: 1750, loss is 0.13549422\n",
      "epoch: 2 step: 1875, loss is 0.027064912\n",
      "================ Starting Testing ================\n",
      "============ Accuracy:{'Accuracy': 0.9751} ============\n",
      "\n",
      "\n",
      "================= The Situation 3 =================\n",
      "== learning_rate:0.01, epoch_size:5, batch_size:16 ==\n",
      "================ Starting Training ================\n",
      "epoch: 1 step: 125, loss is 2.302981\n",
      "epoch: 1 step: 250, loss is 2.322746\n",
      "epoch: 1 step: 375, loss is 2.2958326\n",
      "... ...\n",
      "epoch: 4 step: 1000, loss is 0.020238714\n",
      "epoch: 4 step: 1125, loss is 0.0589997\n",
      "... ...\n",
      "epoch: 5 step: 3625, loss is 0.0015392168\n",
      "epoch: 5 step: 3750, loss is 0.0021714682\n",
      "================ Starting Testing ================\n",
      "============ Accuracy:{'Accuracy': 0.9838} ============\n",
      "\n",
      "\n",
      "================= The Situation 4 =================\n",
      "== learning_rate:0.01, epoch_size:5, batch_size:32 ==\n",
      "================ Starting Training ================\n",
      "epoch: 1 step: 125, loss is 2.29332\n",
      "epoch: 1 step: 250, loss is 2.3053114\n",
      "epoch: 1 step: 375, loss is 2.2936137\n",
      "... ...\n",
      "epoch: 5 step: 1625, loss is 0.0010452892\n",
      "epoch: 5 step: 1750, loss is 0.08312219\n",
      "epoch: 5 step: 1875, loss is 0.00804067\n",
      "================ Starting Testing ================\n",
      "============ Accuracy:{'Accuracy': 0.9821} ============\n",
      "\n",
      "\n",
      "================= The Situation 5 =================\n",
      "== learning_rate:0.05, epoch_size:2, batch_size:16 ==\n",
      "================ Starting Training ================\n",
      "epoch: 1 step: 125, loss is 2.2664146\n",
      "epoch: 1 step: 250, loss is 2.3064432\n",
      "epoch: 1 step: 375, loss is 2.2938812\n",
      "... ...\n",
      "epoch: 2 step: 3500, loss is 2.3222156\n",
      "epoch: 2 step: 3625, loss is 2.3353317\n",
      "epoch: 2 step: 3750, loss is 2.3496826\n",
      "================ Starting Testing ================\n",
      "============ Accuracy:{'Accuracy': 0.1135} ============\n",
      "\n",
      "\n",
      "================= The Situation 6 =================\n",
      "== learning_rate:0.05, epoch_size:2, batch_size:32 ==\n",
      "================ Starting Training ================\n",
      "epoch: 1 step: 125, loss is 2.2778828\n",
      "epoch: 1 step: 250, loss is 2.2844431\n",
      "epoch: 1 step: 375, loss is 1.1201912\n",
      "... ...\n",
      "epoch: 2 step: 1625, loss is 1.959749\n",
      "epoch: 2 step: 1750, loss is 1.9234474\n",
      "epoch: 2 step: 1875, loss is 2.0599048\n",
      "================ Starting Testing ================\n",
      "============ Accuracy:{'Accuracy': 0.206} ============\n",
      "\n",
      "\n",
      "================= The Situation 7 =================\n",
      "== learning_rate:0.05, epoch_size:5, batch_size:16 ==\n",
      "================ Starting Training ================\n",
      "epoch: 1 step: 125, loss is 2.3795776\n",
      "epoch: 1 step: 250, loss is 2.3376794\n",
      "epoch: 1 step: 375, loss is 2.414643\n",
      "... ...\n",
      "epoch: 5 step: 3500, loss is 2.3004642\n",
      "epoch: 5 step: 3625, loss is 2.2935314\n",
      "epoch: 5 step: 3750, loss is 2.2954135\n",
      "================ Starting Testing ================\n",
      "============ Accuracy:{'Accuracy': 0.1009} ============\n",
      "\n",
      "\n",
      "================= The Situation 8 =================\n",
      "== learning_rate:0.05, epoch_size:5, batch_size:32 ==\n",
      "================ Starting Training ================\n",
      "epoch: 1 step: 125, loss is 2.3070464\n",
      "epoch: 1 step: 250, loss is 2.321302\n",
      "epoch: 1 step: 375, loss is 2.3705728\n",
      "... ...\n",
      "epoch: 2 step: 500, loss is 2.3684013\n",
      "epoch: 2 step: 625, loss is 2.245975\n",
      "epoch: 2 step: 750, loss is 1.7440267\n",
      "... ...\n",
      "epoch: 5 step: 1750, loss is 2.3076403\n",
      "epoch: 5 step: 1875, loss is 2.307393\n",
      "================ Starting Testing ================\n",
      "============ Accuracy:{'Accuracy': 0.1028} ============\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from mindspore.train.callback import SummaryCollector\n",
    "from mindspore.nn.metrics import Accuracy\n",
    "from mindspore import context\n",
    "from mindspore.nn.loss import SoftmaxCrossEntropyWithLogits\n",
    "from mindspore.train.serialization import load_checkpoint, load_param_into_net\n",
    "from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor\n",
    "from mindspore.train import Model\n",
    "import os\n",
    "\n",
    "if __name__==\"__main__\":\n",
    "    context.set_context(mode=context.GRAPH_MODE, device_target = \"GPU\")\n",
    "    if os.name == \"nt\":\n",
    "        os.system(\"del/f/s/q *.ckpt *.meta\")\n",
    "    else:\n",
    "        os.system(\"rm -f *.ckpt *.meta *.pb\")\n",
    "\n",
    "    mnist_path = \"./datasets/MNIST_Data/\"\n",
    "    model_path = \"./models/ckpt/lineage_and_scalars_comparision/\"\n",
    "    repeat_size = 1\n",
    "    config_ck = CheckpointConfig(save_checkpoint_steps=1875, keep_checkpoint_max=10)\n",
    "    ckpoint_cb = ModelCheckpoint(prefix=\"checkpoint_lenet\", directory=model_path, config=config_ck)\n",
    "    # define the optimizer\n",
    "    \n",
    "    lrs = [0.01,0.05]\n",
    "    epoch_sizes = [2, 5]\n",
    "    batch_sizes = [16, 32]\n",
    "    situations = [(i, j, k) for i in lrs for j in epoch_sizes for k in batch_sizes]\n",
    "    count = 1\n",
    "    \n",
    "    for lr,epoch_size,batch_size in situations:\n",
    "        momentum = 0.9 \n",
    "        network = LeNet5()\n",
    "        net_loss = SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')\n",
    "        net_opt = nn.Momentum(network.trainable_params(), lr, momentum)\n",
    "        model = Model(network, net_loss, net_opt, metrics={\"Accuracy\": Accuracy()})\n",
    "        summary_collector = SummaryCollector(summary_dir=\"./summary_base/LeNet-MNIST_Data,lr:{},epoch:{},batch_size:{}\"\n",
    "                                             .format(lr, epoch_size, batch_size), collect_freq=1)\n",
    "        # Start to train\n",
    "        print(\"================= The Situation {} =================\".format(count))\n",
    "        print(\"== learning_rate:{}, epoch_size:{}, batch_size:{} ==\".format(lr, epoch_size, batch_size))\n",
    "        print(\"================ Starting Training ================\")\n",
    "        ds_train = create_dataset(os.path.join(mnist_path, \"train\"), batch_size, repeat_size)\n",
    "        model.train(epoch_size, ds_train, callbacks=[ckpoint_cb, summary_collector, LossMonitor(125)], dataset_sink_mode=True)\n",
    "\n",
    "        print(\"================ Starting Testing ================\")\n",
    "        # load testing dataset\n",
    "        ds_eval = create_dataset(os.path.join(mnist_path, \"test\"))\n",
    "        acc = model.eval(ds_eval, callbacks=[summary_collector], dataset_sink_mode=True)\n",
    "        print(\"============ Accuracy:{} ============\\n\\n\".format(acc))\n",
    "        count += 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 启动及关闭MindInsight服务"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里主要展示如何启用及关闭MindInsight，更多的命令集信息，请参考MindSpore官方网站：<https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/visualization_tutorials.html>。\n",
    "\n",
    "启动MindInsight服务命令：\n",
    "```\n",
    "mindinsight start --summary-base-dir=./summary_base --port=8080\n",
    "```\n",
    "\n",
    "- `--summary-base-dir`：MindInsight指定启动工作路径的命令；`./summary_base` 为 `SummaryCollector` 的 `summary_dir` 参数所指定的目录。\n",
    "- `--port`：MindInsight指定启动的端口，数值可以任意为1~65535的范围内。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "停止MindInsight服务命令：\n",
    "```\n",
    "mindinsight stop --port=8080\n",
    "```\n",
    "- `mindinsight stop`：MindInsight关闭服务命令。\n",
    "- `--port=8080`：即MindInsight服务开启在`8080`端口，所以这里写成`--port=8080`。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 溯源分析\n",
    "\n",
    "### 连接到溯源分析地址\n",
    "\n",
    "浏览器中输入:`http://127.0.0.1:8080`进入MindInsight界面如下："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![image](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/mindinsight_homepage_for_lineage.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 模型溯源界面介绍"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "上图训练列表中序号1-8分别是按照8组训练参数，保存的训练数据。点击右上角的溯源分析便可以进入，溯源分析包含模型溯源和数据溯源，首先是模型溯源界面，如下所示："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![image](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/model_lineage_page.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 优化目标区域\n",
    "\n",
    "可以选择模型精度值（Accuracy）或模型损失值（loss）。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![image](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/optimization_target_page_of_model_lineage.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "能直观的看出`learning_rate`、`epoch`、`batch_size`三个参数对本次训练模型的精度值和损失值的参数重要性（参数重要性的数值越接近1表示对此优化目标的影响越大，越接近0则表示对优化目标的影响越小），方便用户决策在训练时需要调整的参数。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 模型训练的详细参数展示界面"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "展示界面中提供了模型训练过程中的各类重要参数信息，包括：网络、优化器、训练样本数量、测试样本数量、学习率、迭代次数、`batch_size`、`device`数目、模型大小、损失函数等等，用户可以自行选择单次训练数据进行可视化分析或者多次训练数据进行可视化比对分析，提高分析效率。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![image](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/detailed_information_page_of_model_lineage.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 数据溯源界面介绍"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "数据溯源展示了用户进行模型训练前的数据增强的过程，且此过程按照增强顺序进行排列。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![image](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/data_lineage_page.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "本例中数据增强的过程包括`MnistDataset`，`Map_TypeCast`，`Map_Resize`，`Map_Rescale`，`Map_HWC2CHW`，`Shuffle`，`Batch`等操作。\n",
    "\n",
    "- 数据集转换（`MnistDataset`）\n",
    "- label的数据类型转换（`Map_TypeCast`）\n",
    "- 图像的高宽缩放（`Map_Resize`）\n",
    "- 图像的比例缩放（`Map_Rescale`）\n",
    "- 图像数据的张量变换（`Map_HWC2CHW`）\n",
    "- 图像混洗（`Shuffle`）\n",
    "- 图像成组（`Batch`）"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 对比分析"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 进入对比分析界面\n",
    "\n",
    "从MindInsight主页进入对比分析界面。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![image](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/mindinsight_homepage_for_scalars_comparision.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "从对比分析界面中可以对比不同的训练中的标量信息，本例使用`SummaryCollector`自动保存了loss值，其他的标量信息保存，请参考[官方文档](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ScalarSummary)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![image](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/scalars_comparision_page.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对比看板中可以选择对比的信息有：\n",
    "\n",
    "- 训练选择：本例有8组不同的训练参数对应的训练信息可供选择，此次选择了其中学习率（lr）分别为0.01和0.05的两组训练过程的数据进行对比。\n",
    "- 标签选择：本例保存了loss值一种标量标签。\n",
    "\n",
    "> 对比曲线可通过调整平滑度来优化显示效果。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 总结"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "本次体验使用了MindSpore的数据收集接口`SummaryCollector`对不同训练参数下的模型训练信息进行收集，并且通过开启MindInsight服务将溯源信息和标量信息进行可视化展示，以上就是本次体验的全部内容。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
